Sparse Matrix Operations on Multi-core Architectures

Authors

  • Carsten Trinitis
  • Tilman Küstner
  • Josef Weidendorfer
  • Jasmin Smajic
Abstract

This paper compares several contemporary multi-core microprocessor architectures with different memory interconnects with respect to performance, speedup, and parallel efficiency. Sparse matrix operations from an electrical engineering application serve as the benchmark. Within this context, two aspects are investigated in more detail: thread-to-core pinning and cache optimization.
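
The paper's code is not reproduced on this page, but its two focus points can be illustrated with a minimal sketch (not the authors' implementation): a CSR sparse matrix-vector product parallelised with OpenMP, in which every thread pins itself to the core matching its thread id. The pinning call is Linux-specific (sched_setaffinity); the CsrMatrix layout and the naive 1:1 thread-to-core map are assumptions for illustration only.

```cpp
// Minimal sketch: OpenMP CSR sparse matrix-vector product with thread pinning.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>
#include <omp.h>
#include <cstdio>
#include <vector>

struct CsrMatrix {               // compressed sparse row (CSR) storage
    int n;                       // number of rows
    std::vector<int> row_ptr;    // size n+1: start of each row in col/val
    std::vector<int> col;        // column index of each non-zero
    std::vector<double> val;     // value of each non-zero
};

// Pin the calling thread to one core (Linux-specific).
void pin_to_core(int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    sched_setaffinity(0, sizeof(set), &set);   // pid 0 = calling thread
}

// y = A * x; rows are split statically so each thread keeps reusing its slice.
void spmv(const CsrMatrix& A, const std::vector<double>& x, std::vector<double>& y) {
#pragma omp parallel
    {
        pin_to_core(omp_get_thread_num());     // naive 1:1 thread-to-core map
#pragma omp for schedule(static)
        for (int i = 0; i < A.n; ++i) {
            double sum = 0.0;
            for (int k = A.row_ptr[i]; k < A.row_ptr[i + 1]; ++k)
                sum += A.val[k] * x[A.col[k]];
            y[i] = sum;
        }
    }
}

int main() {
    // 3x3 example: [2 1 0; 0 3 0; 1 0 4] * [1 1 1]^T = [3 3 5]^T
    CsrMatrix A{3, {0, 2, 3, 5}, {0, 1, 1, 0, 2}, {2, 1, 3, 1, 4}};
    std::vector<double> x(3, 1.0), y(3, 0.0);
    spmv(A, x, y);
    for (double v : y) std::printf("%g\n", v);
}
```

A real code would query the machine topology instead of mapping thread i to core i, but the sketch shows the mechanism the paper investigates: keeping each thread (and the cache lines of its rows) on a fixed core.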

Similar articles

ViennaCL - Linear Algebra Library for Multi- and Many-Core Architectures

CUDA, OpenCL, and OpenMP are popular programming models for the multi-core architectures of CPUs and many-core architectures of GPUs or Xeon Phis. At the same time, computational scientists face the question of which programming model to use to obtain their scientific results. We present the linear algebra library ViennaCL, which is built on top of all three programming models, thus enabling co...
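
As an illustration only (not code from that paper), a sparse matrix-vector product with ViennaCL can be written roughly as in the sketch below, using the library's compressed_matrix and vector types, viennacl::copy for host/device transfer, and viennacl::linalg::prod for the product; it assumes the header-only library is on the include path and a backend (OpenMP, OpenCL or CUDA) selected at compile time.

```cpp
// Sketch of a ViennaCL sparse matrix-vector product; the same host code runs
// on the OpenMP, OpenCL or CUDA backend.
#include <map>
#include <vector>
#include <iostream>

#include "viennacl/compressed_matrix.hpp"
#include "viennacl/vector.hpp"
#include "viennacl/linalg/prod.hpp"

int main() {
    const std::size_t n = 3;

    // Host-side sparse matrix: one std::map per row (a format viennacl::copy accepts).
    std::vector<std::map<unsigned int, double>> host_A(n);
    host_A[0][0] = 2.0; host_A[0][1] = 1.0;
    host_A[1][1] = 3.0;
    host_A[2][0] = 1.0; host_A[2][2] = 4.0;
    std::vector<double> host_x(n, 1.0), host_y(n);

    // Backend-side objects.
    viennacl::compressed_matrix<double> A(n, n);
    viennacl::vector<double> x(n), y(n);

    viennacl::copy(host_A, A);                                    // transfer matrix
    viennacl::copy(host_x.begin(), host_x.end(), x.begin());      // transfer vector

    y = viennacl::linalg::prod(A, x);                             // sparse mat-vec

    viennacl::copy(y.begin(), y.end(), host_y.begin());
    for (double v : host_y) std::cout << v << "\n";               // expect 3 3 5
}
```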

Efficient Sparse Matrix-Matrix Multiplication on Multicore Architectures

We describe a new parallel sparse matrix-matrix multiplication algorithm in shared memory using a quadtree decomposition. Our preliminary implementation is nearly as fast as the best sequential method on one core, and scales well to multiple cores.
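
A rough sketch of the recursive structure behind such a quadtree scheme is given below. It is not the authors' algorithm: for brevity it multiplies small dense matrices (with power-of-two dimensions) and only mirrors the task-parallel recursion, spawning one OpenMP task per quadrant of the result, so each quadrant of C is written by exactly one task.

```cpp
// Illustrative quadtree-style recursive matrix multiply with OpenMP tasks.
#include <omp.h>
#include <vector>
#include <cstdio>

using Mat = std::vector<std::vector<double>>;

// Multiply the n x n blocks A(ar,ac) and B(br,bc), accumulating into C(cr,cc).
// Assumes n is a power of two.
void quad_mult(const Mat& A, const Mat& B, Mat& C,
               int ar, int ac, int br, int bc, int cr, int cc, int n) {
    if (n <= 2) {                                    // tiny base-case block
        for (int i = 0; i < n; ++i)
            for (int k = 0; k < n; ++k)
                for (int j = 0; j < n; ++j)
                    C[cr + i][cc + j] += A[ar + i][ac + k] * B[br + k][bc + j];
        return;
    }
    int h = n / 2;
    // The four quadrants of C are independent -> one task per quadrant.
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j) {
#pragma omp task default(shared) firstprivate(i, j)
            {
                // C_ij = A_i0 * B_0j + A_i1 * B_1j; both updates to the same
                // quadrant stay inside one task, so there are no races on C.
                quad_mult(A, B, C, ar + i * h, ac,     br,     bc + j * h,
                          cr + i * h, cc + j * h, h);
                quad_mult(A, B, C, ar + i * h, ac + h, br + h, bc + j * h,
                          cr + i * h, cc + j * h, h);
            }
        }
#pragma omp taskwait
}

int main() {
    const int n = 4;
    Mat A(n, std::vector<double>(n, 1.0));           // all-ones test matrices,
    Mat B(n, std::vector<double>(n, 1.0));           // so C must be all 4s
    Mat C(n, std::vector<double>(n, 0.0));
#pragma omp parallel
#pragma omp single
    quad_mult(A, B, C, 0, 0, 0, 0, 0, 0, n);
    std::printf("C[0][0] = %g\n", C[0][0]);          // prints 4
}
```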

Parallel finite element technique using Gaussian belief propagation

The computational efficiency of Finite Element Methods (FEMs) on parallel architectures is severely limited by conventional sparse iterative solvers. Conventional solvers are based on a sequence of global algebraic operations that limits their parallel efficiency. Traditionally, sophisticated programming techniques tailored to specific CPU architectures are used to improve the poor performance ...
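
For orientation, the sketch below shows the standard Gaussian belief propagation updates for a linear system A x = b with symmetric, diagonally dominant A (the usual precision/mean message formulation, not the paper's FEM solver). Each message update reads a single off-diagonal entry and the incoming messages of its local neighbours, which is why the scheme maps well onto parallel hardware.

```cpp
// Sketch of Gaussian belief propagation (GaBP) for A x = b with a symmetric,
// diagonally dominant A, using a synchronous message schedule.
#include <vector>
#include <cstdio>

int main() {
    const int n = 3;
    double A[n][n] = {{4, 1, 0}, {1, 4, 1}, {0, 1, 4}};
    double b[n]    = {1, 2, 3};

    // P[i][j], mu[i][j]: precision and mean of the message node i sends to j.
    std::vector<std::vector<double>> P(n, std::vector<double>(n, 0.0));
    std::vector<std::vector<double>> mu(n, std::vector<double>(n, 0.0));

    for (int it = 0; it < 30; ++it) {
        auto Pn = P, mn = mu;
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j) {
                if (i == j || A[i][j] == 0.0) continue;
                // Combine the local prior with all incoming messages except j's.
                double Pi = A[i][i], Mi = b[i];
                for (int k = 0; k < n; ++k)
                    if (k != i && k != j && A[k][i] != 0.0) {
                        Pi += P[k][i];
                        Mi += P[k][i] * mu[k][i];
                    }
                Pn[i][j] = -A[i][j] * A[i][j] / Pi;    // precision message i -> j
                mn[i][j] = Mi / A[i][j];               // mean message, = (P_i\j / A_ij) * mu_i\j
            }
        P.swap(Pn); mu.swap(mn);
    }

    // The marginal of each node approximates the solution component x_i.
    for (int i = 0; i < n; ++i) {
        double Pi = A[i][i], Mi = b[i];
        for (int k = 0; k < n; ++k)
            if (k != i && A[k][i] != 0.0) {
                Pi += P[k][i];
                Mi += P[k][i] * mu[k][i];
            }
        std::printf("x[%d] ~ %.4f\n", i, Mi / Pi);     // approx 0.1786 0.2857 0.6786
    }
}
```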

Condensed forms for the symmetric eigenvalue problem on multi-threaded architectures

We investigate the performance of the routines in LAPACK and the Successive Band Reduction (SBR) toolbox for the reduction of a dense matrix to tridiagonal form, a crucial preprocessing stage in the solution of the symmetric eigenvalue problem, on general-purpose multi-core processors. In response to the advances of hardware accelerators, we also modify the code in the SBR toolbox to accelerate ...
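
As a point of reference only, the one-stage LAPACK reduction that the SBR toolbox is compared against can be called through the LAPACKE C interface as sketched below; this assumes LAPACKE is installed and linked (e.g. -llapacke -llapack) and says nothing about the paper's accelerator-enabled modifications.

```cpp
// Sketch: reduce a dense symmetric matrix to tridiagonal form with LAPACK's
// dsytrd (one-stage reduction). SBR reaches the same form in stages
// (full -> banded -> tridiagonal) to exploit cache-friendly BLAS-3 kernels.
#include <lapacke.h>
#include <cstdio>
#include <vector>

int main() {
    const lapack_int n = 4;
    // Symmetric 4x4 test matrix, row-major; only the lower triangle is read.
    std::vector<double> A = {
        4, 1, 2, 0,
        1, 5, 1, 3,
        2, 1, 6, 1,
        0, 3, 1, 7 };
    std::vector<double> d(n), e(n - 1), tau(n - 1);

    lapack_int info = LAPACKE_dsytrd(LAPACK_ROW_MAJOR, 'L', n,
                                     A.data(), n,          // matrix, leading dim
                                     d.data(), e.data(), tau.data());
    if (info != 0) { std::printf("dsytrd failed: %d\n", (int)info); return 1; }

    std::printf("diagonal:     ");
    for (double v : d) std::printf("%8.4f", v);
    std::printf("\noff-diagonal: ");
    for (double v : e) std::printf("%8.4f", v);
    std::printf("\n");   // d and e define T = Q^T A Q with the same eigenvalues as A
}
```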

Journal title:

Volume   Issue

Pages  -

Publication date: 2009